Smart Paper Notes

A Review of the Convergence of 5G/6G Architecture and Deep Learning

Olusola T. Odeyomi, Olubiyi O. Akintade, Temitayo O. Olowu, Gergely Zaruba

Categories: Machine Learning | Artificial Intelligence

2022-08-16

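The convergence of 5G architecture and deep learning has attracted considerable research interest in both wireless communication and artificial intelligence, because deep learning has been identified as a potential driver of the technologies that make up the 5G architecture. Many surveys on this convergence have therefore been published. However, most existing surveys focus on how deep learning converges with one specific 5G technology and so do not cover the full spectrum of the 5G architecture. One recent survey appears to be comprehensive, but a review of that paper shows that it is not well structured to specifically cover the convergence of deep learning and the 5G technologies. This paper therefore provides an overview of the convergence of the key 5G technologies and deep learning, and discusses the challenges this convergence faces. It also gives a brief overview of the future 6G architecture and how it can converge with deep learning.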

translated by Google Translate

Automatic vehicle trajectory data reconstruction at scale

Yanbing Wang, Derek Gloudemans, Zi Nean Teoh, Lisa Liu, Gergely Zachár, William Barbour, Daniel Work

Categories: Computer Vision

2022-12-15

Vehicle trajectory data has received increasing research attention over the past decades. With improvements in sensing technology such as high-resolution video cameras, in-vehicle radars and lidars, abundant individual and contextual traffic data is now available. However, although the quantity of data is massive, it is by itself of limited utility for traffic research because of noise and systematic sensing errors, and thus requires proper processing to ensure data quality. We draw particular attention to extracting high-resolution vehicle trajectory data from video cameras, as traffic monitoring cameras are becoming increasingly ubiquitous. We explore methods for automatic trajectory data reconciliation, given "raw" vehicle detection and tracking information from automatic video processing algorithms. We propose a pipeline including a) an online data association algorithm that matches fragments associated with the same object (vehicle), formulated as a min-cost network flow problem on a graph, and b) a trajectory reconciliation method formulated as a quadratic program to enhance raw detection data. The pipeline leverages vehicle dynamics and physical constraints to associate tracked objects when they become fragmented, remove measurement noise on trajectories, and impute data missing due to fragmentation. The accuracy is benchmarked on a sample of manually labeled data, which shows that the reconciled trajectories improve the accuracy on all the tested input data for a wide range of measures. An online version of the reconciliation pipeline is implemented and will be applied in a continuous video processing system running on a camera network covering a 4-mile stretch of Interstate-24 near Nashville, Tennessee.

Industry-Scale Orchestrated Federated Learning for Drug Discovery

Martijn Oldenhof, Gergely Ács, Balázs Pejó, Ansgar Schuffenhauer, Nicholas Holway, Noé Sturm, Arne Dieckmann, Oliver Fortmeier, Eric Boniface, Clément Mayer

Categories: Machine Learning | Machine Learning (Statistics)

2022-10-17

To apply federated learning to drug discovery, we developed a novel platform in the context of the European Innovative Medicines Initiative (IMI) project MELLODDY (grant n° 831472), which comprised 10 pharmaceutical companies, academic research labs, large industrial companies and startups. The MELLODDY platform was the first industry-scale platform to enable the creation of a global federated model for drug discovery without sharing the confidential data sets of the individual partners. The federated model was trained on the platform by aggregating the gradients of all contributing partners in a cryptographically secure way after each training iteration. The platform was deployed on an Amazon Web Services (AWS) multi-account architecture running Kubernetes clusters in private subnets. Organisationally, the roles of the different partners were codified as different rights and permissions on the platform and administered in a decentralized way. The MELLODDY platform generated new scientific discoveries, which are described in a companion paper.
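
The abstract does not say which cryptographic protocol was used for gradient aggregation. As a rough illustration of the general idea only, the sketch below uses pairwise additive masking, a common secure-aggregation technique that is assumed here and not taken from the paper. The names `MODULUS`, `masked_updates` and `aggregate`, the integer-quantized gradients, and the seeded `random.Random` standing in for a shared secret are all hypothetical.

```python
import random

MODULUS = 2**31 - 1  # hypothetical modulus for masked integer arithmetic


def masked_updates(updates, seed):
    """Add pairwise masks that cancel in the sum: partner i adds m, partner j subtracts m."""
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            # Stand-in for a secret shared only by partners i and j.
            # random.Random is NOT cryptographically secure; illustration only.
            pair_rng = random.Random(f"{seed}-{i}-{j}")
            for d in range(dim):
                m = pair_rng.randrange(MODULUS)
                masked[i][d] = (masked[i][d] + m) % MODULUS
                masked[j][d] = (masked[j][d] - m) % MODULUS
    return masked


def aggregate(masked):
    """Server-side sum of masked vectors; the pairwise masks cancel modulo MODULUS."""
    dim = len(masked[0])
    return [sum(vec[d] for vec in masked) % MODULUS for d in range(dim)]


# Hypothetical integer-quantized gradients from three partners.
gradients = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]
masked = masked_updates(gradients, seed=42)
total = aggregate(masked)  # equals the element-wise sum of the raw gradients
```

The server only ever handles the masked vectors, yet the sum it computes matches the sum of the unmasked gradients.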

A Snapshot of the Frontiers of Client Selection in Federated Learning

Gergely Dániel Németh, Miguel Ángel Lozano, Novi Quadrianto, Nuria Oliver

Categories: Artificial Intelligence | Machine Learning

2022-09-27

Federated learning (FL) has been proposed as a privacy-preserving approach in distributed machine learning. A federated learning architecture consists of a central server and a number of clients that have access to private, potentially sensitive data. Clients are able to keep their data on their local machines and only share their locally trained model's parameters with a central server that manages the collaborative learning process. FL has delivered promising results in real-life scenarios, such as healthcare, energy, and finance. However, when the number of participating clients is large, the overhead of managing the clients slows down the learning. Thus, client selection has been introduced as a strategy to limit the number of communicating parties at every step of the process. Since the early naïve random selection of clients, several client selection methods have been proposed in the literature. Unfortunately, given that this is an emergent field, there is a lack of a taxonomy of client selection methods, making it hard to compare approaches. In this paper, we propose a taxonomy of client selection in Federated Learning that enables us to shed light on current progress in the field and identify potential areas of future research in this promising area of machine learning.
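
To make the baseline concrete, here is a minimal sketch of the naïve random client selection the abstract mentions, followed by a sample-count-weighted average of the selected clients' parameters. The function names, the toy `clients` dictionary, and the weighting by local sample count are assumptions for illustration; the paper's taxonomy covers more elaborate selection methods that are not shown here.

```python
import random


def select_clients(client_ids, fraction, rng):
    """Uniformly sample a fraction of the clients (at least one) for this round."""
    k = max(1, round(fraction * len(client_ids)))
    return rng.sample(client_ids, k)


def federated_average(updates):
    """Average client parameter vectors, weighted by each client's local sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(params[i] * n for params, n in updates) / total for i in range(dim)]


# Hypothetical toy setup: client id -> (parameter vector, number of local samples).
rng = random.Random(0)
clients = {cid: ([float(cid), 1.0], 10 * (cid + 1)) for cid in range(10)}

chosen = select_clients(sorted(clients), fraction=0.3, rng=rng)
global_params = federated_average([clients[cid] for cid in chosen])
```

Only the three sampled clients communicate with the server in this round, which is the overhead reduction that motivates client selection.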

Proximal Point Imitation Learning

Luca Viano, Angeliki Kamoutsi, Gergely Neu, Igor Krawczuk, Volkan Cevher

Categories: Machine Learning

2022-09-22

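This work develops new algorithms with rigorous efficiency guarantees for infinite-horizon imitation learning (IL) with linear function approximation, without restrictive coherence assumptions. We begin with the minimax formulation of the problem and then outline how to leverage classical tools from optimization, in particular the proximal-point method (PPM) and dual smoothing, for online and offline IL, respectively. Thanks to PPM, we avoid the nested policy evaluation and cost updates that appear in the prior literature on online IL. In particular, we do away with the conventional alternating updates by optimizing a single convex and smooth objective over both cost and Q-functions. When the problem is solved inexactly, we relate the optimization errors to the suboptimality of the recovered policy. As an added bonus, by re-interpreting PPM as dual smoothing with the expert policy as the center point, we also obtain an offline IL algorithm that enjoys theoretical guarantees in terms of the number of required expert trajectories. Finally, we achieve convincing empirical performance for both linear and neural network function approximation.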

Subdiffusive semantic evolution in Indo-European languages

Bogdán Asztalos, Gergely Palla, Dániel Czégel

Categories: Natural Language Processing

2022-09-10

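How do words change their meaning? Although semantic evolution is driven by a variety of distinct factors, including linguistic, societal, and technological ones, we find that one law holds universally across five major Indo-European languages: semantic evolution is strongly subdiffusive. Using an automated pipeline of diachronic distributional semantic embedding that controls for underlying symmetries, we show that words follow stochastic trajectories in meaning space with an anomalous diffusion exponent $\alpha = 0.45 \pm 0.05$, in contrast with diffusing particles, which follow $\alpha = 1$. Randomization methods indicate that preserving temporal correlations in the directions of semantic change is necessary to recover the strongly subdiffusive behavior; however, correlations in change sizes also play an important role. We furthermore show that strong subdiffusion is a robust phenomenon under a wide variety of choices in data analysis and interpretation, such as fitting an ensemble average of displacements or averaging the best-fit exponents of individual word trajectories.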
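
As an illustration of what an anomalous diffusion exponent measures, the sketch below estimates $\alpha$ from the scaling of the mean squared displacement, $\mathrm{MSD}(\tau) \propto \tau^{\alpha}$, using a least-squares fit on log-log axes. It runs on synthetic Gaussian random walks, which are an assumption made here as a stand-in for word-embedding trajectories; the function names and lag choices are hypothetical and this is not the authors' pipeline.

```python
import math
import random


def mean_squared_displacement(trajectories, lag):
    """Squared displacement at a given lag, averaged over start times and trajectories."""
    total, count = 0.0, 0
    for traj in trajectories:
        for t in range(len(traj) - lag):
            total += sum((a - b) ** 2 for a, b in zip(traj[t + lag], traj[t]))
            count += 1
    return total / count


def diffusion_exponent(trajectories, lags):
    """Least-squares slope of log MSD against log lag, i.e. the exponent alpha."""
    xs = [math.log(lag) for lag in lags]
    ys = [math.log(mean_squared_displacement(trajectories, lag)) for lag in lags]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def random_walk(steps, dim, rng):
    """Ordinary diffusion: independent unit-variance Gaussian steps in each dimension."""
    pos = [0.0] * dim
    path = [tuple(pos)]
    for _ in range(steps):
        pos = [p + rng.gauss(0.0, 1.0) for p in pos]
        path.append(tuple(pos))
    return path


rng = random.Random(0)
walks = [random_walk(200, 2, rng) for _ in range(200)]
alpha = diffusion_exponent(walks, lags=[1, 2, 4, 8, 16])  # close to 1 for ordinary diffusion
```

For these uncorrelated walks the fitted exponent is close to 1; a subdiffusive process like the one reported in the abstract would give a slope well below 1.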

Online Learning with Off-Policy Feedback

Germano Gabbianelli, Matteo Papini, Gergely Neu

Categories: Machine Learning | Machine Learning (Statistics)

2022-07-18

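We study the problem of online learning in adversarial bandit problems under a partial observability model called off-policy feedback. In this sequential decision-making problem, the learner cannot directly observe its own rewards, but instead sees the rewards obtained by another, unknown policy run in parallel (the behavior policy). The learner has to face an additional challenge in this setting: because the observations are limited and outside of its control, the learner may not be able to estimate the value of each policy equally well. To address this issue, we propose a set of algorithms that guarantee regret bounds scaling with a natural notion of mismatch between any comparator policy and the behavior policy, achieving improved performance against comparators that are well covered by the observations. We also provide an extension to the setting of adversarial linear contextual bandits, and verify the theoretical guarantees through a set of experiments. Our key algorithmic idea is to adapt the notion of pessimistic reward estimators that has recently become popular in the context of off-policy reinforcement learning.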

BiometricBlender: Ultra-high dimensional, multi-class synthetic data generator to imitate biometric feature space

Marcell Stippinger, Dávid Hanák, Marcell T. Kurbucz, Gergely Hanczár, Olivér M. Törteli, Zoltán Somogyvári

Categories: Machine Learning | Artificial Intelligence

2022-06-21

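The lack of freely available (real-life or synthetic) high or ultra-high dimensional, multi-class datasets may hamper the rapidly growing research on feature screening, especially in the field of biometrics, where the use of such datasets is common. This paper reports a Python package called BiometricBlender, an ultra-high dimensional, multi-class synthetic data generator for benchmarking a wide range of feature screening methods. During the data generation process, the user can control the overall usefulness and the intercorrelations of the blended features, so the synthetic feature space is able to imitate the key properties of a real biometric dataset.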

Collaborative Drug Discovery: Inference-level Data Protection Perspective

Balazs Pejo, Mina Remeli, Adam Arany, Mathieu Galtier, Gergely Acs

Categories: Machine Learning

2022-05-13

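The pharmaceutical industry can better leverage its data assets to virtualize drug discovery through a collaborative machine learning platform. On the other hand, there are non-negligible risks stemming from the unintended leakage of participants' training data, so it is essential for such a platform to be secure and privacy-preserving. This paper describes a privacy risk assessment for collaborative modeling in the preclinical phase of drug discovery, which aims to accelerate the selection of promising drug candidates. After a short taxonomy of state-of-the-art inference attacks, we adopt several of them and customize them to the underlying scenario. Finally, we describe and experiment with a handful of relevant privacy protection techniques to mitigate such attacks.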

Generalization Bounds via Convex Analysis

Gergely Neu, Gábor Lugosi

Categories: Machine Learning (Statistics) | Machine Learning

2022-02-10

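Since the celebrated works of Russo and Zou (2016, 2019) and Xu and Raginsky (2017), it has been well known that the generalization error of supervised learning algorithms can be bounded in terms of the mutual information between their input and output, provided that the loss of any fixed hypothesis has a subgaussian tail. In this work, we generalize this result beyond the standard choice of Shannon's mutual information for measuring the dependence between the input and the output. Our main result shows that it is indeed possible to replace the mutual information by any strongly convex function of the joint input-output distribution, with the subgaussianity condition on the losses replaced by a bound on an appropriately chosen norm capturing the geometry of the dependence measure. This allows us to derive a range of generalization bounds that are either entirely new or strengthen previously known ones. Examples include bounds stated in terms of $p$-norm divergences and the Wasserstein-2 distance, which are applicable to heavy-tailed loss distributions and highly smooth loss functions, respectively. Our analysis is based entirely on elementary tools from convex analysis, tracking the growth of a potential function associated with the dependence measure and the loss function.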
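
For reference, the classical mutual-information bound that the abstract takes as its starting point can be written as follows. The notation is an assumption made here, not taken from the abstract: $S = (Z_1, \dots, Z_n)$ is the i.i.d. training sample, $W$ is the hypothesis output by the algorithm, and the loss $\ell(w, Z)$ is $\sigma$-subgaussian for every fixed $w$.

$$
\left| \mathbb{E}\big[ L(W) - \hat{L}_S(W) \big] \right| \le \sqrt{\frac{2\sigma^2}{n}\, I(W; S)},
$$

where $L(w) = \mathbb{E}_Z[\ell(w, Z)]$ is the population risk, $\hat{L}_S(w) = \frac{1}{n}\sum_{i=1}^{n} \ell(w, Z_i)$ is the empirical risk, and $I(W; S)$ is the Shannon mutual information between the output and the training sample.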
